🐿️ Scour
📉 Model Quantization
Model Compression, Inference Optimization, Edge Deployment, Performance
Scoured 4620 posts in 110.9 ms
The Transformer Architecture: A Deep Dive into How LLMs Actually Work
dev.to · 4h · Discuss: DEV · 📝 NLP
wwes4/AI_Accel_1.5x: AI acceleration framework for ~1.5x speedups in mid-sized models via tension-based pruning. Built utilizing xAI's Grok.
github.com · 1d · Discuss: Hacker News · 📱 Edge AI
[Discussion] The "Noise" Bottleneck in Local 8B RAG – A comparison of cleaning strategies (Regex vs. Unstructured vs. Entropy)
reddit.com · 5h · Discuss: r/LocalLLaMA · 🌸 Bloom Filters
TOON for LLMs: A Comparative Performance Analysis against JSON
gist.github.com · 8h · Discuss: DEV · 💬 Prompt Engineering
is this legit? Supposedly LangVAE straps a VAE + compression algorithm onto any LLM image, reduces resource requirements by up to...
arxiv.org · 3d · Discuss: r/LocalLLaMA · 📱 Edge AI
Qwen2 Technical Report
paperium.net · 1d · Discuss: DEV · 💬 Prompt Engineering
Pandas vs Polars: Why the 2025 Evolution Changes Everything
dev.to · 9h · Discuss: DEV · 📓 Jupyter Notebooks
Streamlinear, a new MCP for Linear
blog.fsck.com · 1d · 🏔️ Alpine.js
Performance Hints for BigQuery
trmlabs.com · 4h · Discuss: Hacker News · 🗄️ Databases
Yann LeCun’s VL-JEPA: The breakthrough that gives AI a "Mind's Eye" (instead of just a mouth).
hisohan.substack.com · 6h · Discuss: Substack · 📱 Edge AI
Introducing the XLab AI Security Guide
lesswrong.com · 7h · 🛡️ AI Security
Treating Functions as Vectors in Hilbert Space
hackaday.com · 1d · 🔢 Embeddings
Show HN: Why is ML inference still so ad-hoc in practice?
news.ycombinator.com · 1d · Discuss: Hacker News · 🧩 LLM Integration
Does the QUIC handshake require compression to be fast?
fastly.com · 1d · Discuss: Hacker News · 🔒 Digital Privacy
I made a CLI to train LLMs in 2 commands (no PyTorch boilerplate)
github.com · 2d · Discuss: r/LocalLLaMA · 💬 Prompt Engineering
WebCC: A C++ framework and toolchain that batches API calls to reduce WASM/JS overhead
reddit.com · 14h · Discuss: r/opensource · 🕸️ WASM
Python 3.6-3.14 Performance on M1, M5 and Zen2
crewtech.se · 1d · Discuss: Hacker News · ⚡ FastAPI
What Is ChatGPT Doing?
vibediary.dev · 2d · Discuss: Hacker News · 🔢 Embeddings
NIPS 2016 Tutorial: Generative Adversarial Networks
paperium.net · 2d · Discuss: DEV · 🖼️ Dual Coding
Your Team Uses AI. Why Aren't You 10x Faster?
bits.logic.inc · 5h · Discuss: Hacker News · ⚡ AI-Driven DevOps